Implementation and Evaluation of String B-Tree
Authors
Abstract
The String B-tree combines a B-tree with Patricia tries as internal-node indices. Rather than storing prefix-compressed keys in each index node, it stores each key in full in a consecutive sequence of data blocks, and makes each downward-traversal decision by a Patricia trie search followed by the consultation of a single key. The String B-tree has the same worst-case performance as the B-tree, but it handles unbounded-length strings and supports more powerful search operations, such as those supported by suffix trees. In this project, we implemented a static String B-tree for string search in external memory. The supported operations are Prefix Search, Range Query, and Substring Search. Our experiments compared the query performance of the String B-tree with that of a B+-tree and linear search. The results show that the String B-tree outperforms the B+-tree by substantially reducing the number of disk accesses.
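The semantics of the three supported operations can be illustrated with a small in-memory sketch. This is not the String B-tree itself: it swaps the blocked Patricia-trie nodes for a plain sorted Python list searched with `bisect`, and it materializes every suffix, which is only reasonable for tiny inputs. All function names here are hypothetical and chosen for illustration.

```python
from bisect import bisect_left, bisect_right


def prefix_search(sorted_keys, prefix):
    """Return every key that starts with `prefix` (keys must be sorted)."""
    lo = bisect_left(sorted_keys, prefix)
    hi = lo
    # Keys sharing a prefix are contiguous in sorted order.
    while hi < len(sorted_keys) and sorted_keys[hi].startswith(prefix):
        hi += 1
    return sorted_keys[lo:hi]


def range_query(sorted_keys, low, high):
    """Return every key k with low <= k <= high (keys must be sorted)."""
    lo = bisect_left(sorted_keys, low)
    hi = bisect_right(sorted_keys, high)
    return sorted_keys[lo:hi]


def build_suffix_index(text):
    """Sorted list of (suffix, start offset) pairs.

    Assumption: illustration only. Storing full suffixes takes quadratic
    space; a real index would store offsets into the text instead.
    """
    return sorted((text[i:], i) for i in range(len(text)))


def substring_search(suffix_index, pattern):
    """Return the sorted start offsets at which `pattern` occurs.

    A substring occurrence is a prefix of some suffix, so this is a
    prefix search over the sorted suffixes.
    """
    # (pattern, -1) sorts before any (suffix, offset) with suffix >= pattern.
    lo = bisect_left(suffix_index, (pattern, -1))
    hits = []
    while lo < len(suffix_index) and suffix_index[lo][0].startswith(pattern):
        hits.append(suffix_index[lo][1])
        lo += 1
    return sorted(hits)


if __name__ == "__main__":
    keys = sorted(["banana", "band", "bandana", "can", "candy"])
    print(prefix_search(keys, "ban"))
    print(range_query(keys, "band", "can"))
    print(substring_search(build_suffix_index("banana"), "ana"))
```

Reducing substring search to prefix search over suffixes is the same idea that lets a suffix-based index answer substring queries; the external-memory structure described above differs in how it organizes those sorted keys on disk.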
Similar resources
Implementation and Evaluation of an External Memory String B-Tree
Preprocessing texts of huge size to answer substring queries is not trivial whenever considering realistic models. We approach this problem by offering an efficient implementation of the String B-Tree data structure, which aims to solve the substring search problem under the dynamic operations. We achieve optimal space usage for the Patricia Tries by representing them via multiarray encoding an...
A Space Efficient Persistent Implementation of an Index for DNA Sequences
Due to newly developed high-throughput technologies for DNA sequencing, the number of fully sequenced species increases rapidly. String databases holding these sequences are very large. In the field of molecular biology the handling of large string data which cannot be broken in words is a great challenge. Hereby the most important string operation is the approximate substring match. This type of...
Quantum vacuum effects for a massive bosonic string in the presence of a background field
We study the Casimir effect for a Bosonic string extended between D-branes, and living in a flat space with an antisymmetric background B-field. We find the Casimir energy as a function of the B-field, and the mass-parameter of the string, and accordingly we obtain a B-dependence correction term to the ground-state mass of the string. We show that for sufficiently large B-field, the ground stat...
The (non-)existence of perfect codes in Lucas cubes
A Fibonacci string of length $n$ is a binary string $b = b_1b_2\ldots b_n$ in which for every $1 \leq i < n$, $b_i\cdot b_{i+1} = 0$. In other words, a Fibonacci string is a binary string without 11 as a substring. Similarly, a Lucas string is a Fibonacci string $b_1b_2\ldots b_n$ that $b_1\cdot b_n = 0$. For a natural number $n\geq 1$, a Fibonacci cube of dimension $n$ is denoted by $\Gamma_n$ and i...
Adapting Tree Distance to Answer Retrieval and Parser Evaluation
The results of experiments on the application of tree-distance to an answer-retrieval task are reported. Various parameters in the definitions of tree-distance are considered, including whole-vs-sub tree, node weighting, wild cards and lexical emphasis. The results show that improving parse-quality maps to improved performance on this tree-distance answer-retrieval task. It is also shown that one o...